GIR-based ensemble sampling approaches for imbalanced learning
نویسندگان
چکیده
منابع مشابه
Ensemble-based hybrid probabilistic sampling for imbalanced data learning in lung nodule CAD
Classification plays a critical role in false positive reduction (FPR) in lung nodule computer aided detection (CAD). The difficulty of FPR lies in the variation of the appearances of the nodules, and the imbalance distribution between the nodule and non-nodule class. Moreover, the presence of inherent complex structures in data distribution, such as within-class imbalance and high-dimensionali...
متن کاملCluster-based under-sampling approaches for imbalanced data distributions
For classification problem, the training data will significantly influence the classification accuracy. However, the data in real-world applications often are imbalanced class distribution, that is, most of the data are in majority class and little data are in minority class. In this case, if all the data are used to be the training data, the classifier tends to predict that most of the incomin...
متن کاملOnline Ensemble Learning for Imbalanced Data Streams
While both cost-sensitive learning and online learning have been studied extensively, the effort in simultaneously dealing with these two issues is limited. Aiming at this challenge task, a novel learning framework is proposed in this paper. The key idea is based on the fusion of online ensemble algorithms and the state of the art batch mode cost-sensitive bagging/boosting algorithms. Within th...
متن کاملMargin-Based Over-Sampling Method for Learning from Imbalanced Datasets
Learning from imbalanced datasets has drawn more and more attentions from both theoretical and practical aspects. Over-sampling is a popular and simple method for imbalanced learning. In this paper, we show that there is an inherently potential risk associated with the oversampling algorithms in terms of the large margin principle. Then we propose a new synthetic over sampling method, named Mar...
متن کاملCluster-Based Sampling Approaches to Imbalanced Data Distributions
For classification problem, the training data will significantly influence the classification accuracy. When the data set is highly unbalanced, classification algorithms tend to degenerate by assigning all cases to the most common outcome. Hence, it is important to select the suitable training data for classification in the imbalanced class distribution problem. In this paper, we propose cluste...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Pattern Recognition
سال: 2017
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2017.06.019